Automatic camera control using unobtrusive vision and audio tracking
Authors
Abstract
While video can be useful for remotely attending and archiving meetings, the video itself is often dull and difficult to watch. One key reason is that, except in very high-end systems, little attention has been paid to the production quality of the video being captured. The video stream from a meeting often lacks detail, and camera shots rarely change unless a person is tasked with operating the camera. This stands in stark contrast to live television, where a professional director creates engaging video by juggling multiple cameras to provide a variety of interesting views. In this paper, we apply lessons from television production to the problem of using automated camera control and selection to improve the production quality of meeting video. In an extensible and robust approach, our system uses off-the-shelf cameras and microphones to unobtrusively track the location and activity of meeting participants, control three cameras, and cut between them to create video with a variety of shots and views, in real time. Evaluation by users and independent coders suggests promising initial results and directions for future work.
Similar resources
Relative Position Sensing and Automatic Control for Observation in the Midwater by an Underwater Vehicle
A vision-based automatic tracking and observation system installed on an ROV has successfully tracked midwater ocean animals (such as jellyfish) in Monterey Bay, California. This system uses stereo vision to localize the tracking vehicle with respect to the target of interest and closes control loops to maintain the target in the views of the cameras. The vision and control algorithms have been...
A Fast, Robust, Automatic Blink Detector
Introduction: A “blink” is defined as the closing and opening of the eyes within a short period of time. In this study, we aimed to introduce a fast, robust, vision-based approach for blink detection. Materials and Methods: This approach consists of two steps. In the first step, the subject’s face is localized every second and with the first blink, the system detects the eye’s location and creates an ope...
3D Lip Tracking and Co-inertia Analysis for Improved Robustness of Audio-video Automatic Speech Recognition
Multimodality is a key issue in robust human-computer interaction. The joint use of audio and video speech variables has been shown to improve the performance of automatic speech recognition (ASR) systems. However, robust methods, in particular for the real-time extraction of video speech features, are still an open research area. This paper addresses the robustness issue of audio-video (AV) ASR s...
A Study on Object Tracking Signal Generation of Pan, Tilt, and Zoom Data
CCD cameras monitoring a moving object generally operate with a fixed viewpoint or a defined pattern. As a result, a free-moving object can quickly move out of the CCD camera's field of view, and automatic control is required to observe the object continuously. This paper proposes a signal generation algorithm for the automatic control of a CCD camera. Using the control signal, the monitorin...